Psych221 / EE362 project
Alex Giladi
Bitrate reduction while keeping high video quality is the most important requirement of content providers (e.g. CATV, IPTV, satellite television), and only such a reduction can justify the expense of replacing old encoders with more modern ones and set-top boxes with ones that support a wider range of features. Especially with real-time encoders moving from firmware to ASICs (which are produced by only a few semiconductor companies), successful video pre-filtering can significantly improve an encoder manufacturer's competitive edge in this industry.
All video compression standards ratified by ISO and ITU-T since the late 1980s compress video in the transform domain. They further apply source coding techniques to the quantized transform coefficients and motion vectors. Although motion vectors can account for a significant part of the bits, especially in low-bitrate, low-quality streams, in most cases the transform coefficients dominate.
Spatial pattern sensitivity of the human visual system is limited; however, video acquisition devices do not necessarily take its properties into account. Moreover, if some noise is added to the video source, it will often be concentrated in the high frequencies. Therefore, eliminating high spatial frequencies is important for successful video coding. Though a lot of this job is done at the quantization stage by the encoder itself, using pattern sensitivity features can reduce bitrate while resulting in visually indistinguishable (or even more pleasing) video.
In [ZW97] the pattern sensitivity of the human visual system is exploited for the S-CIELAB metric (an enhancement of the CIELAB dE metric). In S-CIELAB, prior to calculating the actual distance measure in the CIELAB color space, the image is filtered in an opponent color space [PW93], with filter parameters corresponding to the sensitivity of each channel (luminance, red-green, blue-yellow).
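The per-channel filtering idea can be sketched as follows. This is only an illustration under stated assumptions: the image is assumed to be already converted to the three opponent channels, each channel is blurred with a single separable Gaussian, and the widths in `SIGMA` are hypothetical placeholders rather than the published S-CIELAB parameters (which are more elaborate). The only property the sketch is meant to show is that the chromatic channels are smoothed more strongly than luminance.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel with radius ceil(3 * sigma)."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_rows(plane, kernel):
    """Filter every row of a plane (list of rows), replicating edge samples."""
    radius = len(kernel) // 2
    out = []
    for row in plane:
        last = len(row) - 1
        out.append([
            sum(k * row[min(max(x + j - radius, 0), last)]
                for j, k in enumerate(kernel))
            for x in range(len(row))
        ])
    return out


def transpose(plane):
    return [list(column) for column in zip(*plane)]


def blur(plane, sigma):
    """Separable Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    horizontal = convolve_rows(plane, kernel)
    return transpose(convolve_rows(transpose(horizontal), kernel))


# Hypothetical widths in pixels (assumption, not the S-CIELAB values):
# chromatic channels get wider kernels than luminance.
SIGMA = {"luminance": 0.5, "red_green": 1.5, "blue_yellow": 2.0}


def filter_opponent_image(channels):
    """Blur each opponent channel with its own kernel width."""
    return {name: blur(plane, SIGMA[name]) for name, plane in channels.items()}
```

Applied to a single bright pixel in each channel, the luminance response stays the most concentrated, while the blue-yellow response is spread over the widest neighbourhood.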
Temporal sensitivity patterns of the human visual system were also used in ST-CIELAB [TH99] for enhancing CIELAB dE. However, that metric was not motion-compensated. In [DS84] and [BO92] a non-linear motion-compensated adaptive filter is suggested.
In this project I attempted to use the spatiotemporal sensitivity properties of the human visual system for video pre-processing prior to encoding with the recent H.264/AVC compression standard [H264]. The same filters that were used for image quality metrics could be used for prefiltering.

A chain of two independent pre-processing units was built, where temporal filtering operated after the spatial one. Both units are described in more detail below.
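The ordering of the chain (spatial first, temporal second) can be sketched as below. Both units here are deliberately simple stand-ins and are assumptions of this sketch, not the filters used in the project: `spatial_unit` is a horizontal [1, 2, 1] / 4 low-pass, and `TemporalUnit` is a first-order recursive average without motion compensation. A frame is assumed to be a single plane stored as a list of rows.

```python
def spatial_unit(frame):
    """Stand-in spatial low-pass: horizontal [1, 2, 1] / 4, edges replicated."""
    out = []
    for row in frame:
        last = len(row) - 1
        out.append([
            (row[max(x - 1, 0)] + 2 * row[x] + row[min(x + 1, last)]) / 4.0
            for x in range(len(row))
        ])
    return out


class TemporalUnit:
    """Stand-in temporal filter: out = alpha * current + (1 - alpha) * previous out."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha      # hypothetical blending weight
        self.previous = None    # last output frame, None before the first frame

    def __call__(self, frame):
        if self.previous is None:
            # The first frame has no history and passes through unchanged.
            output = [list(row) for row in frame]
        else:
            a = self.alpha
            output = [
                [a * cur + (1.0 - a) * prev for cur, prev in zip(cur_row, prev_row)]
                for cur_row, prev_row in zip(frame, self.previous)
            ]
        self.previous = output
        return output


def preprocess(frames, alpha=0.5):
    """Run every frame through the spatial unit, then through the temporal unit."""
    temporal_unit = TemporalUnit(alpha)
    return [temporal_unit(spatial_unit(frame)) for frame in frames]
```

Because the two units only communicate through the frame they pass along, either one can be replaced or disabled without touching the other.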


Four standard 8-second NTSC (525-line) reference sequences from VQEG (Video Quality Experts Group) were used (available from the VQEG site). The sequence format was according to the ITU-R BT.656 specification.
The reason for selecting 4:2:2 standard-definition sequences was to give more weight to the chroma channels (the ideal setting would thus be 4:4:4; however, free high-quality 4:4:4 sources were not easily available even within the JVT community). The BT.656 format is defined as 720x486. The effective picture size is 704x480, and thus the rightmost 16 pixels and the lowest 6 lines were ignored for filtering purposes.
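The cropping described above can be written as a short sketch. It assumes a planar 4:2:2 frame in which each plane is a list of rows and the two chroma planes have half the luma width and the full luma height; the function names are hypothetical.

```python
FULL_WIDTH, FULL_HEIGHT = 720, 486   # frame size stated in the text
CROP_WIDTH, CROP_HEIGHT = 704, 480   # region that is actually filtered


def crop_plane(plane, width, height):
    """Keep the top-left width x height region of a plane (list of rows)."""
    return [row[:width] for row in plane[:height]]


def crop_422_frame(y, cb, cr):
    """Drop the rightmost 16 luma pixels (8 chroma samples) and the lowest 6 lines."""
    return (
        crop_plane(y, CROP_WIDTH, CROP_HEIGHT),
        crop_plane(cb, CROP_WIDTH // 2, CROP_HEIGHT),
        crop_plane(cr, CROP_WIDTH // 2, CROP_HEIGHT),
    )
```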
[Figure: sample frames from the four reference sequences: City, Mobile, Football, Susie]

Spatial filtering resulted in a very significant reduction of bitrate, especially at higher rates / lower quantizer values. Such an effect is unsurprising, since the filter eliminates high-frequency coefficients that fine quantization settings would otherwise have preserved. The resulting bitrate gain seems to be very high, though I believe it will be somewhat more modest if the encoder settings are better optimized (e.g. rate control, larger and higher-quality motion estimation).
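The effect can be illustrated with a toy one-dimensional example. This is a sketch under stated assumptions, not the H.264 transform: it uses an 8-point orthonormal floating-point DCT-II as a stand-in for the codec's integer transform, a uniform quantizer with a hypothetical step size, and a [1, 2, 1] / 4 low-pass as the pre-filter. The input is a flat signal with a small alternating (highest-frequency) disturbance.

```python
import math


def dct_ii(samples):
    """Orthonormal 1-D DCT-II (stand-in for the codec's integer transform)."""
    n = len(samples)
    coefficients = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coefficients.append(scale * sum(
            v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(samples)
        ))
    return coefficients


def smooth(samples):
    """[1, 2, 1] / 4 low-pass with edge replication (hypothetical pre-filter)."""
    last = len(samples) - 1
    return [
        (samples[max(i - 1, 0)] + 2 * samples[i] + samples[min(i + 1, last)]) / 4.0
        for i in range(len(samples))
    ]


def count_nonzero_after_quantization(samples, step):
    """Number of transform coefficients that survive uniform quantization."""
    return sum(1 for c in dct_ii(samples) if round(c / step) != 0)


STEP = 4  # hypothetical fine quantizer step

raw = [100 + 4 * (-1) ** n for n in range(8)]   # flat level plus alternating noise
filtered = smooth(raw)

raw_count = count_nonzero_after_quantization(raw, STEP)
filtered_count = count_nonzero_after_quantization(filtered, STEP)
print(raw_count, filtered_count)
```

At this step size the unfiltered signal keeps several non-zero AC coefficients that the encoder would have to code, whereas the pre-filtered signal keeps fewer. With a much coarser step the quantizer would have removed most of them anyway, which is consistent with the gain being larger at low quantizer values.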

[H264] ITU-T Recommendation H.264: Advanced video coding for generic audiovisual services
[PW93] A. Poirson, B. Wandell, Appearance of colored patterns: pattern-color separability, Journal of the Optical Society of America, 10(12), 2458-2470.
[ZW97] X. Zhang, B. Wandell, A spatial extension of CIELAB for digital color image reproduction, SID Journal 1997, 5(1), pp 61-64
[TH99] X. Tong, D. Heeger, C. Van den Branden Lambrecht, Video quality evaluation using ST-CIELAB, Proceedings of SPIE, Volume 3644, May 1999, pp. 185-196
[DS84] E. Dubois, S. Sabri, Noise reduction in image sequences using motion-compensated temporal filtering, T-COMM(32), pp. 826-831, 1984
[BO92] J. Boyce, Noise reduction of image sequences using adaptive motion compensated frame averaging, IEEE ICASSP, volume 3, pp. 461-464, 1992
[WA95] B. Wandell, Foundations of vision, Sinauer Associates, 1995
[PY03] C. Poynton, Digital Video and HDTV Algorithms and Interfaces, Morgan Kaufmann, 2003